Parallel structurally-symmetric sparse matrix-vector products on multi-core processors
Authors
Abstract
We consider the problem of developing an efficient multi-threaded implementation of the matrix-vector multiplication algorithm for sparse matrices with structural symmetry. Matrices are stored using the compressed sparse row-column format (CSRC), designed to exploit the symmetric non-zero pattern observed in global finite element matrices. Unlike classical compressed storage formats, performing the sparse matrix-vector product with CSRC requires thread-safe access to the destination vector. To avoid race conditions, we have implemented two partitioning strategies. In the first, each thread allocates an array for storing its contributions, which are later combined in an accumulation step; we analyze four different ways of performing this accumulation. The second strategy employs a coloring algorithm to group rows that can be processed concurrently by threads. Our results indicate that, although it increases the working set size, the former approach yields the best performance improvements for most matrices.
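To make the thread-safety issue concrete, the following is a minimal C/OpenMP sketch of the first strategy, written under stated assumptions about the storage layout; it is not the authors' code, and every identifier (csrc_t, ptr, col, ad, al, au, csrc_spmv) is hypothetical and chosen only for illustration.

/*
 * Minimal sketch (not the authors' implementation) of y = A*x for a
 * structurally symmetric sparse matrix in a CSRC-like layout, using
 * the per-thread-buffer strategy.  Assumed layout: ad holds the
 * diagonal, the strictly lower triangle is stored in CSR form
 * (ptr, col, al), and au[k] holds the transposed upper entry that
 * shares pattern position k.
 */
#include <stdlib.h>
#include <omp.h>

typedef struct {
    int     n;    /* matrix dimension                             */
    int    *ptr;  /* row pointers of the lower-triangle pattern   */
    int    *col;  /* column indices of the lower-triangle pattern */
    double *ad;   /* diagonal values                              */
    double *al;   /* lower-triangle values A(i,j), j < i          */
    double *au;   /* upper-triangle values A(j,i), same position  */
} csrc_t;

void csrc_spmv(const csrc_t *A, const double *x, double *y)
{
    int nthreads = omp_get_max_threads();
    /* one zero-initialized private copy of y per thread */
    double *buf = calloc((size_t)nthreads * A->n, sizeof(double));

    #pragma omp parallel
    {
        double *yloc = buf + (size_t)omp_get_thread_num() * A->n;

        /* Each thread handles a block of rows.  The scattered write
         * yloc[j] would race on a shared y, but is safe here because
         * yloc is private to the thread. */
        #pragma omp for schedule(static)
        for (int i = 0; i < A->n; i++) {
            yloc[i] += A->ad[i] * x[i];
            for (int k = A->ptr[i]; k < A->ptr[i + 1]; k++) {
                int j = A->col[k];
                yloc[i] += A->al[k] * x[j];  /* row-wise entry A(i,j)  */
                yloc[j] += A->au[k] * x[i];  /* scattered entry A(j,i) */
            }
        }
        /* Implicit barrier above, then the accumulation step: each
         * thread reduces a contiguous slice of all private buffers
         * (just one of several possible ways to combine them). */
        #pragma omp for schedule(static)
        for (int i = 0; i < A->n; i++) {
            double s = 0.0;
            for (int t = 0; t < nthreads; t++)
                s += buf[(size_t)t * A->n + i];
            y[i] = s;
        }
    }
    free(buf);
}

The scattered update yloc[j] is what would race on a shared destination vector; privatizing it costs one extra n-sized buffer per thread, which corresponds to the increase in working set size mentioned in the abstract. The coloring strategy avoids these buffers altogether by running concurrently only rows whose scattered targets cannot overlap, so threads may write directly into a shared y.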
Similar resources
Algorithmic patterns for H-matrices on many-core processors
In this work, we consider the reformulation of hierarchical (H) matrix algorithms for many-core processors with a model implementation on graphics processing units (GPUs). H matrices approximate specific dense matrices, e.g., from discretized integral equations or kernel ridge regression, leading to log-linear time complexity in dense matrix-vector products. The parallelization of H matrix oper...
GAMS Index for the NAG Parallel Library
C    Elementary and special functions (search also class L5)
C1   Integer-valued functions (e.g., factorial, binomial coefficient, permutations, combinations, floor, ceiling)
C06GXFP  Factorizes a positive integer n as n = n1 × n2. This routine may be used in conjunction with C06MCFP.
D    Linear Algebra
D1   Elementary vector and matrix operations
D1a  Elementary vector operations
D1a1 Set to constant
D1a...
Towards a fast parallel sparse matrix-vector multiplication
The sparse matrix-vector product is an important computational kernel that runs inefficiently on many computers with super-scalar RISC processors. In this paper we analyse the performance of the sparse matrix-vector product with symmetric matrices originating from the FEM and describe techniques that lead to a fast implementation. It is shown how these optimisations can be incorporated into an ...
On Parallel Solution of Sparse Triangular Linear Systems in CUDA
The acceleration of sparse matrix computations on modern many-core processors, such as graphics processing units (GPUs), has been recognized and studied for over a decade. Significant performance enhancements have been achieved for many sparse matrix computational kernels, such as sparse matrix-vector products and sparse matrix-matrix products. Solving linear systems with sparse triangular struc...
Journal: CoRR
Volume: abs/1003.0952
Pages: -
Publication year: 2010